Measuring Dialect Pronunciation Differences using Levenshtein Distance

Authors

  • Wilbert Heeringa
  • Wilbert Jan Heeringa
  • G. De Schutter
Abstract

The work in this thesis has been carried out under the auspices of the Behavioral and Cognitive Neurosciences (BCN) research school, Groningen.

Acknowledgements

This thesis is attributed to exactly one author, as can be seen on both the cover and the title pages. The author is the one who is responsible for the content. But it should be emphasized that many people contributed to the coming about of this thesis. I would like to mention them.

First of all, I thank my promotor, John Nerbonne. More than five years ago, he encouraged me to work on the dialectometry project, and during the project he gave invaluable support. Without his support this thesis would never have been published.

Probably just the pictures in this thesis will catch the eye of the reader. While I implemented programs for calculating distances between language varieties and for clustering them, Peter Kleiweg developed software for creating dendrograms, multidimensional scaling plots and different types of (color) maps. I am grateful to Peter for developing and making available this excellent software, and for his extensive help when creating the figures.

During a visit on a cloudy afternoon, one of my best friends, Martin de Vries, suggested finding speech segment distances on the basis of an acoustic representation. Since he was becoming the inventor of a new type of voice-producing prosthesis, this approach seemed obvious to him. I thank him for this valuable suggestion.

In cooperation with Roberto Bolognesi I worked on the comparison of Sardinian dialects. In this small project, the use of acoustic segment distances was developed and the use of the Levenshtein distance was improved. I thank Roberto for his help and his friendly cooperation.

In the field of phonetics and phonology I received help from many people. I thank Paul Boersma and David Weenink for making available their excellent PRAAT program.

Norwegian dialects play an important role in this thesis. Jørn Almberg made Norwegian recordings and transcriptions of the fable 'The North Wind and the Sun'. I am grateful to him for his permission to use this material and for his help during the whole investigation. I thank Charlotte Gooskens for the good cooperation in our Norwegian research, and for her permission to use the results of her perception experiment in Norway. Thanks are due to Charlotte Gooskens and Sabine Rosenhart for cutting the Norwegian word samples. I thank Arnold Dalen …

Similar resources

Inducing Sound Segment Differences Using Pair Hidden Markov Models

Pair Hidden Markov Models (PairHMMs) are trained to align the pronunciation transcriptions of a large contemporary collection of Dutch dialect material, the Goeman-Taeldeman-Van Reenen-Project (GTRP, collected 1980–1995). We focus on the question of how to incorporate information about sound segment distances to improve sequence distance measures for use in dialect comparison. PairHMMs induce se...

Measuring Norwegian dialect distances using acoustic features

Computational dialectometry has been proven to be useful for finding dialect relationships and identifying dialect areas. The first to develop a method of measuring dialect distances was Jean Séguy, assisted and inspired by Henri Guiter (Chambers and Trudgill, 1998). Strongly related to the methodology of Séguy is the work of Goebl, although the basis of Goebl’s work was developed mainly in dep...

Perceptive evaluation of Levenshtein dialect distance measurements using Norwegian dialect data

The Levenshtein dialect distance method has proven to be a successful method for measuring phonetic distances between Dutch dialects. The aim of the present investigation is to validate the Levenshtein dialect distance with perceptual data from a language area other than the Dutch, namely Norway. We calculate the correlation between the Levenshtein distances and the distances between 15 Norwegi...

The Relative Divergence of Dutch Dialect Pronunciations from their Common Source: An Exploratory Study

In this paper we use the Reeks Nederlandse Dialectatlassen as a source for the reconstruction of a ‘proto-language’ of Dutch dialects. We used 360 dialects from locations in the Netherlands, the northern part of Belgium and French-Flanders. The density of dialect locations is about the same everywhere. For each dialect we reconstructed 85 words. For the reconstruction of vowels we used knowledg...

A cognitively grounded measure of pronunciation distance

In this study we develop pronunciation distances based on naive discriminative learning (NDL). Measures of pronunciation distance are used in several subfields of linguistics, including psycholinguistics, dialectology and typology. In contrast to the commonly used Levenshtein algorithm, NDL is grounded in cognitive theory of competitive reinforcement learning and is able to generate...

Norwegian Dialects Examined Perceptually and Acoustically WILBERT HEERINGA

Gooskens (2003) described an experiment which determined linguistic distances between 15 Norwegian dialects as perceived by Norwegian listeners. The results are compared to Levenshtein distances, calculated on the basis of transcriptions (of the words) of the same recordings as used in the perception experiment. The Levenshtein distance is equal to the sum of the weights of the insertions, dele...
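
The notion of a distance built from weighted edit operations, as mentioned in the excerpt above, can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis or in the listed papers. It assumes the three standard edit operations (insertion, deletion, substitution), and the function name `levenshtein` and the uniform default weights are hypothetical choices made only for this illustration; the acoustic segment distances mentioned in the acknowledgements would replace the flat substitution weight.

```python
def levenshtein(source, target, ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Return the minimum total weight of edit operations turning source into target.

    The uniform weights are an assumption for illustration only.
    """
    # previous[j] holds the cost of turning the empty prefix of source
    # into the first j segments of target (j insertions).
    previous = [j * ins_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        # Turning the first i segments of source into nothing takes i deletions.
        current = [i * del_cost]
        for j, t in enumerate(target, start=1):
            substitute = previous[j - 1] + (0.0 if s == t else sub_cost)
            delete = previous[j] + del_cost
            insert = current[j - 1] + ins_cost
            current.append(min(substitute, delete, insert))
        previous = current
    return previous[-1]


# Usage: the inputs can be any sequences, for example lists of phonetic segments.
distance = levenshtein(list("kitten"), list("sitting"))
```

Because the function accepts any sequence, a transcription can be passed as a list of segment symbols rather than a raw string, which keeps multi-character segments intact.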


Publication date: 2005